Description of formats supported by SoundCon
============================================

There are two general classes of sound sample formats, compressed and
uncompressed. Uncompressed samples are described relatively simply by the
number of channels, bits per sample and sample format. Compressed samples
are usually more complex; these are described later.

The sub-formats of each format are not described individually because they
simply provide you with a choice of standard sub-formats. The meanings of
the items found in the sub-menus (and the raw data interpretation menu) are
described here:

  - Mono and stereo are for one and two channel samples. Note that stereo
    samples must be interleaved in blocks of one (ie LRLRLRLR...). Formats
    like AudioWorks (which interleaves in blocks of eight) can't be
    interpreted as raw data by SoundCon, which has special conversion code
    to deal with them. One or two channels are supported for output
    conversion and playback. Up to four are supported for conversion input.

  - 8 and 16 bit is for bits per sample. This only applies to the linear
    samples since mu-law and VIDC are defined as 8-bit formats. 12 bit
    samples are not supported.

  - Little and Big endian describes how to interpret 16-bit samples (or
    any number stored in more than one byte). Little endian means the least
    significant byte is stored first and Big endian means the most
    significant byte is stored first. Little endian is the usual format for
    formats originating from Arm and Intel based computers (eg Arc, PC) and
    big endian is the usual format for formats originating from Motorola
    based machines (eg Mac, Amiga).

  - Signed linear samples are stored as 2's complement numbers (ie signed)
    that are directly proportional to the instantaneous amplitude of the
    sample (ie linear).

  - Unsigned linear samples are similar except that the maximum amplitude
    is added to all samples (ie 128 for 8-bit, 32768 for 16-bit) to make all
    the numbers positive (ie unsigned). In practice only the sign bit has to
    be changed to convert from signed to unsigned.

  - mu-law samples are logarithmic which gives them a better dynamic range
    than linear samples. Linear samples usually need to be about 12 bits to
    provide the same low-amplitude resolution as mu-law. This format is
    common on Unix machines.
    
  - A-law samples are a form of compression. 8-bit A-law compresses a 13-bit
    linear sample by taking the 4 most significant bits from the 12-bit
    magnitude and shifting them by 0-7 bits as defined by a further 3 bits.
    The eighth bit is the sign. This gives A-law a good dynamic range with
    few bits because it is essentially logarithmic.

  - Arc VIDC is the internal format used by the sound system on the
    Archimedes making it a logical way to store samples on the Arc. It's
    very similar to mu-law.
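
The linear items above (signed/unsigned, little/big endian and stereo
interleaving) can be sketched in a few lines of Python. This is only an
illustration of the descriptions given here, not SoundCon's own code, and
all of the function names are made up for the example.

```python
import struct

def signed_to_unsigned(value, bits):
    """Flip the sign bit; same as adding 2**(bits-1) modulo 2**bits."""
    mask = (1 << bits) - 1
    return (value & mask) ^ (1 << (bits - 1))

def unsigned_to_signed(value, bits):
    """Flip the sign bit, then reinterpret the result as 2's complement."""
    flipped = value ^ (1 << (bits - 1))
    if flipped & (1 << (bits - 1)):
        flipped -= 1 << bits
    return flipped

def read_16bit(data, big_endian=False):
    """Decode raw bytes (even length) as signed 16-bit samples."""
    fmt = (">" if big_endian else "<") + "%dh" % (len(data) // 2)
    return list(struct.unpack(fmt, data))

def split_stereo(samples):
    """Split LRLRLR... samples (blocks of one) into left and right lists."""
    return samples[0::2], samples[1::2]
```

For example, the two bytes 01 02 read as 513 when little endian and as 258
when big endian.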


Major format summaries
----------------------

Unless otherwise noted the PlaySample module used by SoundCon can play
samples that are any combination of:

  - 8 or 16 bit
  - mono or stereo
  - linear signed, linear unsigned, mu-law, A-law or Arc VIDC

with the exception that 16-bit samples can only be linear since mu-law,
A-law and Arc VIDC are not actually defined for anything other than 8-bit.

Playback lists use the following summaries:

  - mono/stereo 8        (8-bit linear signed or unsigned)
  - mono/stereo 16       (16-bit linear signed or unsigned)
  - mono/stereo 8 slin   (8-bit signed linear)
  - mono/stereo 8 ulin   (8-bit unsigned linear)
  - mono/stereo 16 slin  (16-bit signed linear)
  - mono/stereo 16 ulin  (16-bit unsigned linear)
  - mono/stereo mu-law   (8-bit mu-law)
  - mono/stereo A-law    (8-bit A-law)
  - mono/stereo VIDC     (8-bit Archimedes VIDC)

A summary like "mono/stereo 8/16" means all combinations are allowable, ie
"mono 8, stereo 8, mono 16, stereo 16".


The following major formats are supported by the current version of SoundCon.

  Audio IFF
  Armadeus
  ARMovie
  Sun Audio
  AudioWorks
  Datavox
  IFF/8SVX
  Psion S3a
  Raw data
  VOC (Creative Voice)
  Voice module (write only)
  RIFF WAVE


Audio IFF

  Audio IFF (Interchange File Format) is a standard 'chunked' file format
  which supports a wide range of sound sub-formats. This format originated
  on Apple computers so multi-byte numbers are big endian.

  Channels: Any number of channels can be represented (in theory, up to
  2^32). SoundCon can read up to 4 channels and write up to 2.

  Bits: Anywhere from 1-32 bits per sample can be represented. SoundCon can
  read from 1-24 bits and can write 8 or 16 bits.

  Format: The format is always linear signed left justified to make it byte
  aligned. eg for 4-bit samples, the 4 most significant bits of a byte are
  used. For 12-bit samples, the 12 most significant bits of 2 bytes are used.
  Samples are always byte aligned (ie they are not packed).

  Playback: Mono/stereo 8 slin/16 slin
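
  The left justification described above can be undone with an arithmetic
  shift. The following is a minimal sketch, assuming a single sample has
  already been extracted from the file as raw bytes; the function name is
  hypothetical and this is not SoundCon's own code.

```python
def aiff_sample_to_int(raw, bits):
    """Decode one big-endian, left-justified signed sample of `bits` bits."""
    nbytes = (bits + 7) // 8          # samples are padded to whole bytes
    if len(raw) != nbytes:
        raise ValueError("expected %d byte(s) for %d-bit samples" % (nbytes, bits))
    value = int.from_bytes(raw, "big", signed=True)
    # The unused low-order padding bits are dropped by an arithmetic shift,
    # which keeps the sign intact.
    return value >> (nbytes * 8 - bits)
```

  So a 12-bit sample stored as the bytes 7F F0 decodes to 2047, and FF F0
  decodes to -1.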


Armadeus

  Armadeus is a simple Archimedes format. Its only variable is the playback
  frequency (stored as the sample period in microseconds in the first byte
  of the file). The format is fixed as 8-bit linear signed (ie 2's
  complement). The only way to identify these samples is by file type as no
  characterising information is stored in the file.

  Playback: Mono 8 (ie yes!)
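
  Since the first byte holds the sample period in microseconds, the playback
  frequency follows by taking the reciprocal. A small sketch (the function
  name is made up for the example):

```python
def armadeus_rate(first_byte):
    """Convert the sample period (microseconds) into a frequency in Hz."""
    if first_byte <= 0:
        raise ValueError("sample period must be positive")
    return 1_000_000 / first_byte
```

  For example, a period of 50 microseconds corresponds to 20000 Hz.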


ARMovie

  ARMovie is the soundtrack part of Acorn Replay files.

  Channels: mono or stereo

  Format: linear signed or unsigned (8 or 16 bit) or 8-bit VIDC

  Playback: Mono/stereo 8/16/VIDC (but only if the data is continuous. Sound
  is normally interleaved with video data so generally only sound-only
  ARMovies can be played.)


Sun Audio

  This format originated on Unix computers. 8kHz mu-law is a very common
  sub-format for this although it can support many (only a couple of which
  are supported by SoundCon).

  Channels: Any number of channels can be represented (in theory, up to
  2^32). SoundCon can read up to 4 channels and write up to 2.

  Bits: This depends on the format. 8, 16, 24 or 32 can be represented.

  Format: Sun Audio now supports a large number of formats. Basically there
  is 8-bit mu-law, 8-bit A-law, compressed samples and various
  representations of 8, 16, 24 and 32-bit linear samples. SoundCon can read
  8-bit mu-law, 8-bit A-law and 8, 16, 24 and 32-bit linear samples with up
  to 4 channels. It can write 8-bit mu-law, 8-bit A-law and 8 and 16-bit
  linear samples with up to 2 channels.

  Playback: Mono/stereo 8 slin/16 slin/mu-law/A-law
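
  As mu-law is the most common Sun Audio sub-format, here is a sketch of
  expanding one mu-law byte to a 16-bit signed linear value. It follows the
  usual ITU-T G.711 definition of mu-law; whether SoundCon uses exactly this
  arithmetic (or a lookup table) is not stated here, and the function name
  is made up for the example.

```python
MULAW_BIAS = 0x84  # bias added before encoding in G.711 mu-law

def mulaw_to_linear(byte):
    """Expand one 8-bit G.711 mu-law value to 16-bit signed linear."""
    u = ~byte & 0xFF                     # mu-law bytes are stored inverted
    magnitude = ((u & 0x0F) << 3) + MULAW_BIAS
    magnitude <<= (u & 0x70) >> 4        # 3-bit segment selects the shift
    if u & 0x80:                         # top bit is the sign
        return MULAW_BIAS - magnitude
    return magnitude - MULAW_BIAS
```

  The small step size near zero and the large one near full scale are what
  give mu-law its dynamic range.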
  
  
Datavox

  This is an Acorn specific format supporting most standard formats. Stereo
  and/or 16-bit samples have their bytes in different parts of the file which
  makes this format difficult, and hence slow, for SoundCon to handle
  (SoundCon was designed for interleaved samples). Formats 1, 2 and 3 are
  recognised but saving is always done in format 3.
  
  Channels: mono or stereo (format 3 only).
  
  Bits: 8 or 16 (format 3 only).
  
  Format: VIDC, mu-law, s-lin 8/16 and u-lin 8/16. The 16-bit formats are
  only supported in format 3.
  
  Playback: Although SoundCon can play all of the supported formats, only the
  high byte of 16-bit samples and the left channel of stereo samples are used.
  This is because the bytes are not interleaved and the PlaySample module
  needs them to be.
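
  To illustrate why non-interleaved storage is awkward, the sketch below
  rebuilds 16-bit samples from two separate runs of bytes and shows the
  high-byte-only approximation used for playback. The exact Datavox layout
  is not described here, so the two separate "planes" of high and low bytes
  are an assumption made purely for the example, as are the function names
  and the choice of signed linear data.

```python
def merge_planes(high_bytes, low_bytes):
    """Combine separately stored high and low bytes into signed 16-bit."""
    if len(high_bytes) != len(low_bytes):
        raise ValueError("planes must be the same length")
    samples = []
    for hi, lo in zip(high_bytes, low_bytes):
        value = (hi << 8) | lo
        samples.append(value - 0x10000 if value & 0x8000 else value)
    return samples

def high_byte_preview(high_bytes):
    """8-bit signed approximation that ignores the low bytes entirely."""
    return [b - 0x100 if b & 0x80 else b for b in high_bytes]
```

  The preview needs only one contiguous run of bytes, which is why it suits
  a player that expects interleaved data.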
    
  
Psion S3a

  This format is used on the Psion portable computers. It has a rigid format
  and there are no variables whatsoever. It is fixed at mono 8-bit A-law at
  8kHz.
  
  Channels: 1
  Bits: 8
  Format: A-law
  Playback: mono A-law
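
  Because the data is always A-law, converting it to linear comes down to
  one expansion per byte. The sketch below follows the usual ITU-T G.711
  definition of A-law, including its inversion of alternate bits (XOR with
  0x55). That Psion files store their bytes in exactly this form is an
  assumption, and the function name is made up for the example.

```python
def alaw_to_linear(byte):
    """Expand one 8-bit G.711 A-law value to 16-bit signed linear."""
    a = byte ^ 0x55                      # undo the alternate-bit inversion
    mantissa = (a & 0x0F) << 4           # 4 stored magnitude bits
    segment = (a & 0x70) >> 4            # 3 bits giving the shift
    if segment == 0:
        magnitude = mantissa + 8
    else:
        magnitude = (mantissa + 0x108) << (segment - 1)
    # In A-law a set top bit means a positive sample.
    return magnitude if a & 0x80 else -magnitude
```

  The largest magnitude this produces is 32256, and the smallest is 8.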


AudioWorks

  This is a versatile Archimedes format based on the Acorn Chunk File
  Format.

  Channels: Up to 255 channels in theory. SoundCon can read up to 4 channels
  and write up to 2. Note that polyphonic samples are interleaved in blocks
  of 8.

  Bits: 8, 12 or 16 bits for linear formats, 8 bits for the others.

  Format: Linear signed, Linear unsigned, mu-law, A-law and Archimedes VIDC.

  Playback: Mono/stereo 8/16/mu-law/A-law/VIDC
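
  Converting from blocks of 8 to the LRLRLR... order needed for raw data and
  playback might look like the sketch below. It assumes the samples have
  already been decoded into a list, that each channel's block follows the
  previous channel's block, and that the data ends on a whole block; none of
  this is confirmed here for real AudioWorks files. The function name is
  made up for the example.

```python
def deblock(samples, channels=2, block=8):
    """Turn block-interleaved samples into sample-interleaved order."""
    frame = channels * block
    if len(samples) % frame:
        raise ValueError("data does not end on a whole block")
    out = []
    for start in range(0, len(samples), frame):
        chunk = samples[start:start + frame]
        for i in range(block):
            for ch in range(channels):
                out.append(chunk[ch * block + i])
    return out
```

  With two channels, the input L0..L7 R0..R7 becomes L0 R0 L1 R1 and so on.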


IFF/8SVX

  This is a cousin of Audio IFF (they both use the IFF chunk format) that
  specifically holds 8-bit mono samples (usually instrument voices). It
  supports section repeating and data compression, neither of which is
  supported by SoundCon.

  Playback: Mono 8 slin (ie yes)


VOC

  The Creative Voice format (VOC) is a PC sourced format similar to 8SVX. It
  can represent silence blocks, repeat blocks and compression. SoundCon can
  only handle trivial VOCs currently - mainly because I haven't been able to
  find any reliable information on this format.

  Playback: Mono 8 ulin. If the sample is made up of continued blocks there
  will be clicking in the playback (which SoundCon warns you of). This will
  not be present when it is converted to another format.


Voice module

  Voice modules are basically self-playing samples; they are stored in the
  mono VIDC format. The module provides a single command, Splay_xxx, where
  xxx is the filename given when saved. Executing this command causes the
  module to claim a sound channel, play the sound at the originally
  sampled rate and release the channel when it's finished. This is a
  transient playback since it only attaches itself to the sound system
  temporarily.

  There are three option sets provided by SoundCon:

  Transient/voice: A voice attaches itself to the system voice list when it
  is loaded so that it can be subsequently assigned to a sound channel and
  played. A transient voice does not declare itself as a voice and can only
  be used by issuing a star command.

  Fixed/variable frequency: Variable frequency sounds are normal instrument
  sounds that can have their pitch altered by the sound command. Fixed
  frequency mode is useful for 'effect' samples where the idea is to
  reproduce the original sound. In this case, the sample is played at the
  sampled frequency and cannot be altered at play time.

  Volume mode: There are four volume modes:
  - Normal vol: The volume is controlled as normal by the channel volume in
    the sound command and by the overall system volume.
  - Sys vol: The channel volume in the sound command is ignored (maximum is
    used) so only the overall system volume affects the volume.
  - Channel vol: The overall system volume is ignored (works as if it was
    maximised) and the volume is controlled solely by the sound command.
  - Fixed vol: All volume settings are ignored. The volume used is that of
    the playback slider in SoundCon when you initiate the save.

  Playback: Since this is a write-only format, SoundCon itself cannot play
  voice modules. They are self-playing.


RIFF WAVE

  The WAVE format is one of the many supported chunks in the RIFF 'chunked'
  file format which supports a wide range of sound sub-formats. This format
  originated on PCs so multi-byte numbers are little endian.

  Channels: Usually one or two are supported although up to 65535 could
  theoretically be stored. SoundCon can read up to 4 channels and write up
  to 2.

  Bits: Depends on the format (4, 8 or 16).

  Format: WAVE supports numerous formats many of which are difficult to find
  information about. The ones supported by SoundCon are:
    PCM: 8 bit linear unsigned or 16 bit signed, mono or stereo.
    mu-law: 8-bit logarithmic format, mono or stereo.
    A-law: 8-bit pseudo logarithmic format, mono or stereo.
    MS ADPCM: 4-bit Microsoft ADPCM, mono or stereo.
    DVI ADPCM: 4-bit Intel ADPCM, mono or stereo.

  Playback: mono/stereo 8/16/mu-law/A-law/MSADPCM/DVI ADPCM (ie all those
  recognised)
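
  The sub-format of a WAVE file is identified by a format tag in its 'fmt '
  chunk. The sketch below walks the RIFF chunks and reads that tag along
  with the channel count, sample rate and bits per sample. The tag numbers
  are the ones registered by Microsoft for these formats; the names attached
  to them and the function name are just for this example, and this is not
  SoundCon's own code.

```python
import struct

FORMAT_NAMES = {
    0x0001: "PCM",
    0x0002: "MS ADPCM",
    0x0006: "A-law",
    0x0007: "mu-law",
    0x0011: "DVI ADPCM",
}

def read_wave_format(data):
    """Return (format name, channels, sample rate, bits) from WAVE bytes."""
    if data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        # Each chunk is a 4-byte id followed by a little endian length.
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        body = pos + 8
        if chunk_id == b"fmt ":
            tag, channels, rate, _bytes_per_sec, _align, bits = \
                struct.unpack_from("<HHIIHH", data, body)
            name = FORMAT_NAMES.get(tag, "unknown (0x%04X)" % tag)
            return name, channels, rate, bits
        pos = body + size + (size & 1)   # chunks are padded to even length
    raise ValueError("no fmt chunk found")
```

  Any tag not in the table is reported as unknown rather than guessed at.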



ADPCM formats
=============

ADPCM (Adaptive Delta Pulse Code Modulation, also known as Adaptive
Differential PCM) is a way of storing sound in a
compressed format. It is not an algorithm per se but simply says that the
sound samples are encoded in some way using the differences between samples
(ie deltas). Pulse Code Modulation (PCM) is basically a fancy way of saying
'digitised sound'. There are many different ADPCM algorithms and SoundCon
supports the only two I could find information on.

Sound is typically very difficult to compress and standard compression
techniques like LZH will maybe save 25% if you're lucky. ADPCM algorithms
are lossy (imperfect) compression techniques guaranteed to give a highish
compression ratio. Usually samples are stored as 4-bit encoded deltas (75%
saving on 16-bit samples) and the quality is good, though not perfect;
rather like JPEG picture compression.

Because ADPCM algorithms rely on local trends in the sound wave to predict
what the next sample will be, they do not encode noisy sounds (having a
random element) very well. They do well on things like large music files.
